Skip to content

Improvements over chuking refactoring#4533

Open
h-mayorquin wants to merge 4 commits intoSpikeInterface:mainfrom
h-mayorquin:improve_time_series_chuking_refactor
Open

Improvements over chuking refactoring#4533
h-mayorquin wants to merge 4 commits intoSpikeInterface:mainfrom
h-mayorquin:improve_time_series_chuking_refactor

Conversation

@h-mayorquin
Copy link
Copy Markdown
Collaborator

@h-mayorquin h-mayorquin commented Apr 20, 2026

This PR fixes the Kilosort4 regression reported on #4472. The root cause is that #4472 turned write_binary_recording into a plain alias for write_binary(chunkable, ...), so passing recording=rec as a keyword argument no longer matched the parameter name and raised TypeError. The immediate fix is to make write_binary_recording a real wrapper again, with recording as its actual parameter:

def write_binary_recording(recording, file_paths, **kwargs):
    return _write_binary(recording, file_paths, **kwargs)

I think this points to a more general principle worth keeping in the codebase: the SpikeInterface public API should stay in SpikeInterface vocabulary. A user saving a recording should see recording in signatures, docstrings, and IDE autocomplete, not the implementation concept chunkable. Keeping the generic functions private (underscore-prefixed, living in time_series_tools) and exposing thin recording-centered wrappers as the public surface is what makes this class of regression structurally hard to reintroduce: the parameter name a user types is the one that lives in the signature, and cannot drift when the internals are refactored. write_memory_recording is converted the same way for consistency.

The second change is the naming of the abstraction introduced in #4472. We touched on this in the maintenance meeting with @alejoe91: the real abstraction here is not chunkable in general, it is a time series over which we run processes in chunks. My proposal is that the class names should reflect that rather than staying at the overly abstract Chunkable level. Concretely:

  • ChunkableMixin is renamed to TimeSeries: what this class actually models is a time series. Slicing methods like get_data(start_frame, end_frame, ...) are a natural property of a sliceable time series, not an extra "chunkable" capability. The executor does the chunking; the class just has to expose a time-series contract.
  • ChunkableSegment is renamed to TimeSeriesSegment: the segment class contains no chunking code at all. It is only time handling (sampling_frequency, t_start, time_vector, get_times, sample_index_to_time, time_to_sample_index), and its own docstring describes it as providing "methods to handle time kwargs." Dropping Chunkable matches what the class actually is.
  • ChunkExecutor is renamed to TimeSeriesChunkExecutor: this is where the verb "chunk" belongs, because the executor is what performs the chunking. Naming it TimeSeriesChunkExecutor also makes the axis of chunking explicit (it chunks along time, not arbitrary axes) and ties the vocabulary together: a TimeSeries / TimeSeriesSegment is chunked by a TimeSeriesChunkExecutor. The module chunkable_tools.py and the helper divide_chunkable_into_chunks are renamed to time_series_tools.py and divide_time_series_into_chunks to keep the import path and internal utilities consistent with the new names.

The two ideas come together in the zarr writer. Before this PR the public function was literally named write_chunkable_to_zarr: the function name itself asked a SpikeInterface user to learn the internal concept "chunkable" just to save a recording. Now the public entry point is write_recording_to_zarr(recording, ...) and the generic implementation has moved to a private _write_time_series_to_zarr(...), so the helper name is honest about the TimeSeries contract it operates on. The behavior is identical; only the surface a user sees changes. When BaseImaging lands, write_imaging_to_zarr(imaging, ...) will be a second thin wrapper over the same private core, and imaging users will never see chunkable either.

@alejoe91 alejoe91 added the core Changes to core module label Apr 20, 2026
def __init__(
self,
chunkable: ChunkableMixin,
chunkable: TimeSeries,
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
chunkable: TimeSeries,
time_series: TimeSeries,

?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@alejoe91
Copy link
Copy Markdown
Member

I agree Chunkable is too abstract! I like TimeSeries better. @samuelgarcia @chrishalcrow do you like the name?

@h-mayorquin should we also rename then the chunkable arguments to time_series in all the functions? I think we should right?

@h-mayorquin
Copy link
Copy Markdown
Collaborator Author

I agree Chunkable is too abstract! I like TimeSeries better. @samuelgarcia @chrishalcrow do you like the name?

@h-mayorquin should we also rename then the chunkable arguments to time_series in all the functions? I think we should right?

Yes, I agree.

@h-mayorquin h-mayorquin marked this pull request as ready for review April 20, 2026 16:23
@chrishalcrow
Copy link
Copy Markdown
Member

In case this takes a some time to merge, I've done a quick dirty patch to fix the kilosort issue here #4540

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

core Changes to core module

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants